rEMM: Extensible Markov Model for Data Stream Clustering in R
Authors
Abstract
Clustering streams of continuously arriving data has become an important application of data mining in recent years, and several researchers have proposed efficient algorithms. However, clustering alone neglects the fact that data in a data stream are characterized not only by the proximity of data points, which clustering uses, but also by a temporal component. The extensible Markov model (EMM) adds this temporal component to data stream clustering by superimposing a dynamically adapting Markov chain. In this paper we introduce the R extension package rEMM, which implements EMM, and we discuss some examples and applications.
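The idea in the abstract, clusters acting as states of a Markov chain that grows as the stream arrives, can be illustrated with a minimal Python sketch. This is not the rEMM API and not the authors' implementation: the class `SimpleEMM`, its methods, and the fixed-center threshold clustering are hypothetical simplifications chosen only to show how transition counts can be layered on top of a clustering step.

```python
import math
from collections import defaultdict


class SimpleEMM:
    """Toy illustration (hypothetical, not rEMM): threshold clustering plus transition counts."""

    def __init__(self, threshold):
        self.threshold = threshold      # max distance for a point to join an existing cluster
        self.centers = []               # one fixed representative point per cluster (= state)
        self.counts = defaultdict(int)  # (from_state, to_state) -> observed transitions
        self.current = None             # state assigned to the most recent point

    def _nearest(self, point):
        """Return (index, distance) of the closest center, or (None, inf) if there is none."""
        best, best_dist = None, math.inf
        for i, center in enumerate(self.centers):
            d = math.dist(point, center)
            if d < best_dist:
                best, best_dist = i, d
        return best, best_dist

    def update(self, point):
        """Assign the point to a state, creating a new state if needed, and count the transition."""
        state, dist = self._nearest(point)
        if state is None or dist > self.threshold:
            # Simplifying assumption: the first point of a cluster stays its center.
            self.centers.append(tuple(point))
            state = len(self.centers) - 1
        if self.current is not None:
            self.counts[(self.current, state)] += 1
        self.current = state
        return state

    def transition_probability(self, src, dst):
        """Estimate P(dst | src) from the transition counts seen so far."""
        total = sum(n for (s, _), n in self.counts.items() if s == src)
        return self.counts.get((src, dst), 0) / total if total else 0.0


# Usage: a short stream that alternates between two well-separated regions.
emm = SimpleEMM(threshold=1.0)
stream = [(0, 0), (0.1, 0.2), (5, 5), (0, 0.1), (5.1, 4.9)]
states = [emm.update(p) for p in stream]
```

The clustering step alone would report two clusters; the transition counts additionally record how the stream moves between them over time, which is the temporal component the abstract refers to.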
Similar resources
Application of Markov-Chain Analysis and Stirred Tanks in Series Model in Mathematical Modeling of Impinging Streams Dryers
Although the principles of impinging stream reactors have been developed for more than half a century, the performance of such devices has not been investigated extensively from the viewpoint of mathematical modeling. In this study, two mathematical models were proposed to describe particulate matter drying in tangential impinging stream dryers. The models were de...
Cluster-Based Image Segmentation Using Fuzzy Markov Random Field
Image segmentation is an important task in image processing and computer vision that attracts the attention of many researchers. An image carries two kinds of information about its pixels: statistical and structural information, which refer to the feature value of pixel data and the local correlation of pixel data, respectively. Markov random field (MRF) is a tool for modeling statistical and structural inf...
Introduction to stream: An Extensible Framework for Data Stream Clustering Research with R
In recent years, data streams have become an increasingly important area of research for the computer science, database and statistics communities. Data streams are ordered and potentially unbounded sequences of data points created by a typically non-stationary data generating process. Common data mining tasks associated with data streams include clustering, classification and frequent pattern ...
A New Model to Speculate CLV Based on Markov Chain Model
The present study attempts to establish a new framework to speculate customer lifetime value by a stochastic approach. In this research, the customer lifetime value is considered as a combination of the customer’s present and future value. In the first step of our model, it is essential to define customer groups based on their behavior similarities, and in the second step a mechanism to count current ...
Temporal Structure Learning for Clustering Massive Data Streams in Real-Time
This paper describes one of the first attempts to model the temporal structure of massive data streams in real-time using data stream clustering. Recently, many data stream clustering algorithms have been developed which efficiently find a partition of the data points in a data stream. However, these algorithms disregard the information represented by the temporal order of the data points in th...